NSF PAR Search | NSF Public Access Repository

Deep Learning for Natural Language Processing: A Gentle Introduction

Surdeanu, Mihai; Valenzuela-Escárcega, Marco (February 2024, Cambridge University Press)

Upon encountering this publication, one might ask the obvious question, "Why do we need another deep learning and natural language processing book?" Several excellent ones have been published, covering both theoretical and practical aspects of deep learning and its application to language processing. However, from our experience teaching courses on natural language processing, we argue that, despite their excellent quality, most of these books do not target their most likely readers. The intended reader of this book is one who is skilled in a domain other than machine learning and natural language processing and whose work relies, at least partially, on the automated analysis of large amounts of data, especially textual data. Such experts may include social scientists, political scientists, biomedical scientists, and even computer scientists and computational linguists with limited exposure to machine learning. Existing deep learning and natural language processing books generally fall into two camps. The first camp focuses on the theoretical foundations of deep learning. This is certainly useful to the aforementioned readers, as one should understand the theoretical aspects of a tool before using it. However, these books tend to assume the typical background of a machine learning researcher and, as a consequence, we have often seen students who do not have this background rapidly get lost in such material. To mitigate this issue, the second type of book that exists today focuses on the machine learning practitioner; that is, on how to use deep learning software, with minimal attention paid to the theoretical aspects. We argue that focusing on practical aspects is similarly necessary but not sufficient. Considering that deep learning frameworks and libraries have become fairly complex, the chance of misusing them due to theoretical misunderstandings is high. We have commonly seen this problem in our courses, too.
This book, therefore, aims to bridge the theoretical and practical aspects of deep learning for natural language processing. We cover the necessary theoretical background and assume minimal machine learning background from the reader. Our aim is that anyone who took introductory linear algebra and calculus courses will be able to follow the theoretical material. To address practical aspects, this book includes pseudo code for the simpler algorithms discussed and actual Python code for the more complicated architectures. The code should be understandable by anyone who has taken a Python programming course. After reading this book, we expect that the reader will have the necessary foundation to immediately begin building real-world, practical natural language processing systems, and to expand their knowledge by reading research publications on these topics.

https://doi.org/10.1017/9781009026222
Proceedings of the 2nd Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning (PAN-DL 2023)

Surdeanu, Mihai; Riloff, Ellen; Chiticariu, Laura; Freitag, Dayne; Hahn-Powell, Gus; Morrison, Clayton T; Noriega-Atala, Enrique; Sharp, Rebecca; Valenzuela-Escárcega, Marco (December 2023, Proceedings of the 2nd Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning)

Message from the Organizers

Welcome to the second edition of the Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning (Pan-DL)! Our workshop is being organized in a hybrid format on December 6, 2023, in conjunction with the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP). In the past year, the natural language processing (NLP) field (and the world at large!) has been hit by the large language model (LLM) "tsunami." This happened for the right reasons: LLMs perform extremely well in a multitude of NLP tasks, often with minimal training and, perhaps for the first time, have made NLP technology extremely approachable to non-expert users. However, LLMs are not perfect: they are not really explainable; they are not pliable, i.e., they cannot be easily modified to correct any errors observed; and they are not efficient due to the overhead of decoding. In contrast, rule-based methods are more transparent to subject matter experts; they are amenable to having a human in the loop through intervention, manipulation and incorporation of domain knowledge; and, further, the resulting systems tend to be lightweight and fast. This workshop focuses on all aspects of rule-based approaches, including their application, representation, and interpretability, as well as their strengths and weaknesses relative to state-of-the-art machine learning approaches. Considering the large number of potential directions in this neuro-symbolic space, we emphasized inclusivity in our workshop. We received 19 submissions and accepted 10 for oral presentation. This resulted in an overall acceptance rate of 52%. Our workshop also includes 6 presentations of papers that were accepted in Findings of EMNLP. In addition to the oral presentations of the accepted papers, our workshop includes a keynote talk by Yunyao Li, who has made many important contributions to the field of symbolic approaches for natural language processing.
Further, the workshop contains a panel that will discuss the merits and limitations of rules in the new LLM era. The panelists will be academics with expertise in both neural- and rule-based methods, industry experts that employ these methods for commercial products, and subject matter experts that have used rule-based methods for domain-specific applications. We thank Yunyao Li and the panelists for their important contribution to our workshop! Finally, we are thankful to the members of the program committee for their insightful reviews! We are confident that all submissions have benefited from their expert feedback. Their contribution was a key factor for accepting a diverse and high-quality list of papers, which we hope will make the second edition of the Pan-DL workshop a success, and will motivate many future editions.

Pan-DL 2023 Organizers
December 6, 2023
Neural-Guided Program Synthesis of Information Extraction Rules Using Self-Supervision

Noriega-Atala, Enrique; Vacareanu, Robert; Hahn-Powell, Gus; Valenzuela-Escárcega, Marco A. (October 2022, Proceedings of the First Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning)

We propose a neural-based approach for rule synthesis designed to help bridge the gap between the interpretability, precision and maintainability exhibited by rule-based information extraction systems and the scalability and convenience of statistical information extraction systems. This is achieved by avoiding placing the burden of learning another specialized language on domain experts and instead asking them to provide a small set of examples in the form of highlighted spans of text. We introduce a transformer-based architecture that drives a rule synthesis system. The system leverages a self-supervised approach for pre-training a large-scale language model, complemented by an analysis of different loss functions and aggregation mechanisms for variable-length sequences of user-annotated spans of text. The results are encouraging and point to different desirable properties, such as speed and quality, depending on the choice of loss and aggregation method.
From Examples to Rules: Neural Guided Rule Synthesis for Information Extraction

Vacareanu, Robert; Valenzuela-Escárcega, Marco A.; Barbosa, George C.; Sharp, Rebecca; Surdeanu, Mihai (June 2022, LREC proceedings)

While deep learning approaches to information extraction have had many successes, they can be difficult to augment or maintain as needs shift. Rule-based methods, on the other hand, can be more easily modified. However, crafting rules requires expertise in linguistics and the domain of interest, making it infeasible for most users. Here we attempt to combine the advantages of these two directions while mitigating their drawbacks. We adapt recent advances from the adjacent field of program synthesis to information extraction, synthesizing rules from provided examples. We use a transformer-based architecture to guide an enumerative search, and show that this reduces the number of steps that need to be explored before a rule is found. Further, we show that our synthesized rules achieve state-of-the-art performance on the 1-shot scenario of a task that focuses on few-shot learning for relation classification, and competitive performance in the 5-shot scenario.
Proceedings of Pattern-based Approaches to NLP in the Age of Deep Learning (PAN-DL)

Chiticariu, Laura; Goldberg, Yoav; Hahn-Powell, Gus; Morrison, Clayton T; Naik, Aakanksha; Sharp, Rebecca; Surdeanu, Mihai; Valenzuela-Escárcega, Marco; Noriega-Atala, Enrique (October 2022, Proceedings of the First Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning)

Message from the Organizers

Welcome to the first edition of the Workshop on Pattern-based Approaches to NLP in the Age of Deep Learning (Pan-DL)! Our workshop is being organized online on October 17, 2022, in conjunction with the 29th International Conference on Computational Linguistics (COLING). We all know that deep-learning methods have dominated the field of natural language processing in the past decade. However, these approaches usually rely on the availability of high-quality and high-quantity data annotation. Furthermore, the learned models are difficult to interpret and incur substantial technical debt. As a result, these approaches tend to exclude users that lack the necessary machine learning background. In contrast, rule-based methods are easier to deploy and adapt; they support human examination of intermediate representations and reasoning steps; they are more transparent to subject-matter experts; they are amenable to having a human in the loop through intervention, manipulation and incorporation of domain knowledge; and, further, the resulting systems tend to be lightweight and fast. This workshop focuses on all aspects of rule-based approaches, including their application, representation, and interpretability, as well as their strengths and weaknesses relative to state-of-the-art machine learning approaches. Considering the large number of potential directions in this neuro-symbolic space, we emphasized inclusivity in our workshop. We received 13 papers and accepted 10 for oral presentation. This resulted in an overall acceptance rate of 77%. In addition to the oral presentations of the accepted papers, our workshop includes a keynote talk by Ellen Riloff, who has made crucial contributions to the field of natural language processing, many of which are at the intersection of rule- and neural-based methods. Further, the workshop contains a panel that will discuss the merits and limitations of rules in our neural era.
The panelists will be academics with expertise in both neural- and rule-based methods, industry experts that employ these methods for commercial products, government officials in charge of AI funding, organizers of natural language processing evaluations, and subject matter experts that have used rule-based methods for domain-specific applications. We thank Ellen Riloff and the panelists for their important contribution to our workshop! Finally, we are thankful to the members of the program committee for their insightful reviews! We are confident that all submissions have benefited from their expert feedback. Their contribution was a key factor for accepting a diverse and high-quality list of papers, which we hope will make the first edition of the Pan-DL workshop a success, and will motivate many future editions.

Pan-DL 2022 Organizers
October 2022
A Human-machine Interface for Few-shot Rule Synthesis for Information Extraction

Vacareanu, Robert; Barbosa, George C.; Noriega-Atala, Enrique; Hahn-Powell, Gus; Sharp, Rebecca; Valenzuela-Escárcega, Marco A.; Surdeanu, Mihai (July 2022, NAACL)

We propose a system that assists a user in constructing transparent information extraction models, consisting of patterns (or rules) written in a declarative language, through program synthesis. Users of our system can specify their requirements through the use of examples, which are collected with a search interface. The rule-synthesis system proposes rule candidates and the results of applying them on a textual corpus; the user has the option to accept the candidate, request another option, or adjust the examples provided to the system. Through an interactive evaluation, we show that our approach generates high-precision rules even in a 1-shot setting. In a second evaluation on a widely used relation extraction dataset (TACRED), our method generates rules that considerably outperform manually written patterns. Our code, demo, and documentation are available at https://clulab.github.io/odinsynth/.
Enabling Search and Collaborative Assembly of Causal Interactions Extracted from Multilingual and Multi-domain Free Text

https://doi.org/10.18653/v1/n19-4003

Barbosa, George C.; Wong, Zechy; Hahn-Powell, Gus; Bell, Dane; Sharp, Rebecca; Valenzuela-Escárcega, Marco A.; Surdeanu, Mihai (January 2019, Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies)

Learning what to read: Focused machine reading

https://doi.org/10.18653/v1/D17-1313

Noriega-Atala, Enrique; Valenzuela-Escárcega, Marco A.; Morrison, Clayton; Surdeanu, Mihai (September 2017, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing)

Recent efforts in bioinformatics have achieved tremendous progress in the machine reading of biomedical literature, and the assembly of the extracted biochemical interactions into large-scale models such as protein signaling pathways. However, batch machine reading of literature at today’s scale (PubMed alone indexes over 1 million papers per year) is unfeasible due to both cost and processing overhead. In this work, we introduce a focused reading approach to guide the machine reading of biomedical literature towards what literature should be read to answer a biomedical query as efficiently as possible. We introduce a family of algorithms for focused reading, including an intuitive, strong baseline, and a second approach which uses a reinforcement learning (RL) framework that learns when to explore (widen the search) or exploit (narrow it). We demonstrate that the RL approach is capable of answering more queries than the baseline, while being more efficient, i.e., reading fewer documents.
Eidos, INDRA, & Delphi: From Free Text to Executable Causal Models

https://doi.org/10.18653/v1/N19-4008

Sharp, Rebecca; Pyarelal, Adarsh; Gyori, Benjamin; Alcock, Keith; Laparra, Egoitz; Valenzuela-Escárcega, Marco A.; Nagesh, Ajay; Yadav, Vikas; Bachman, John; Tang, Zheng; et al. (June 2019, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (Demonstrations))
Large-scale automated machine reading discovers new cancer-driving mechanisms

https://doi.org/10.1093/database/bay098

Valenzuela-Escárcega, Marco A; Babur, Özgün; Hahn-Powell, Gus; Bell, Dane; Hicks, Thomas; Noriega-Atala, Enrique; Wang, Xia; Surdeanu, Mihai; Demir, Emek; Morrison, Clayton T (January 2018, Database)